On the server, the pipelines run periodically, scheduled according to pipeline and data dependencies.
You can also run specific pipelines manually, for development or to execute custom pipelines.
In [1]:
import os
os.chdir('..')
os.getcwd()
Out[1]:
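Running dpp without any arguments lists the available pipelines and their current status: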
In [25]:
!{'dpp'}
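A specific pipeline can be run by passing its id to dpp run; the --verbose flag prints a more detailed log of each processing step. Here we run the kns_committee pipeline: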
In [3]:
!{'dpp run --verbose ./committees/kns_committee'}
In [16]:
KNS_COMMITTEE_DATAPACKAGE_PATH = './data/committees/kns_committee/datapackage.json'
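The pipeline writes its output under the ./data directory. As a quick sanity check (a minimal sketch, assuming the pipeline run above completed successfully), verify the datapackage file exists before loading it:
In [ ]:
import os
# the assertion message is just a hint for this notebook, not part of the pipeline itself
assert os.path.exists(KNS_COMMITTEE_DATAPACKAGE_PATH), 'datapackage not found - run the kns_committee pipeline first'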
Each package may contain multiple resources; let's see which resource names are available in the kns_committee package.
In [17]:
from datapackage import Package
kns_committee_package = Package(KNS_COMMITTEE_DATAPACKAGE_PATH)
kns_committee_package.resource_names
Out[17]:
In [18]:
KNS_COMMITTEE_RESOURE_NAME = 'kns_committee'
Inspect the kns_committee resource descriptor, which includes metadata and field descriptions.
In [19]:
import yaml
print(yaml.dump(kns_committee_package.get_resource(KNS_COMMITTEE_RESOURE_NAME).descriptor,
                allow_unicode=True, default_flow_style=False))
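For a more compact view of the schema, you can iterate over the field objects directly (a short sketch using the datapackage library's schema API; the actual field names depend on what the pipeline defined):
In [ ]:
# print just the field names and types from the resource schema
resource = kns_committee_package.get_resource(KNS_COMMITTEE_RESOURE_NAME)
for field in resource.schema.fields:
    print(field.name, field.type)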
Print the first 5 rows of data
In [23]:
for i, row in enumerate(kns_committee_package.get_resource(KNS_COMMITTEE_RESOURE_NAME).iter(keyed=True), 1):
    if i > 5:
        break  # stop after the first 5 rows instead of iterating over the whole resource
    print(f'-- row {i} --')
    print(yaml.dump(row, allow_unicode=True, default_flow_style=False))
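If you prefer working with a DataFrame, the whole resource can be read into pandas (a minimal sketch, assuming pandas is installed and the resource fits in memory):
In [ ]:
import pandas as pd
# read(keyed=True) loads all rows as dicts keyed by field name
df = pd.DataFrame(kns_committee_package.get_resource(KNS_COMMITTEE_RESOURE_NAME).read(keyed=True))
df.head()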